{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Your First Click Model: Click Through Rate\n",
    "\n",
    "This section examines the session data and computes the probability of relevance using Click-Through-Rate. Roughly the number of clicks divided by the number of sessions. Then we examine wheter there's position bias in that data - that is, consider perhaps that some documents have a higher CTR only because they show up higher in the search results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append(\"..\")\n",
    "sys.path.append(\"../ltr\")\n",
    "from aips import *\n",
    "from ltr.judgments import Judgment\n",
    "from ltr.sdbn_functions import *\n",
    "import pandas\n",
    "\n",
    "# if using a Jupyter notebook, includue:\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.01\n",
    "Judgments with binary grades"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Judgment(grade, keywords, doc_id)\n",
    "sample_judgments = [\n",
    "  # for 'social network' query\n",
    "  Judgment(1, \"social network\", 37799),  # The Social Network\n",
    "  Judgment(0, \"social network\", 267752), # #chicagoGirl\n",
    "  Judgment(0, \"social network\", 38408),  # Life As We Know It\n",
    "  Judgment(0, \"social network\", 28303),  # The Cheyenne Social Club\n",
    "  # for 'star wars' query\n",
    "  Judgment(1, \"star wars\", 11),     # Star Wars\n",
    "  Judgment(1, \"star wars\", 1892),   # Return of Jedi\n",
    "  Judgment(0, \"star wars\", 54138),  # Star Trek Into Darkness\n",
    "  Judgment(0, \"star wars\", 85783),  # The Star\n",
    "  Judgment(0, \"star wars\", 325553)  # Battlestar Galactica\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.02\n",
    "Judgments with probablistic grades"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "sample_judgments = [\n",
    "  Judgment(0.99, \"social network\", 37799),  # The Social Network\n",
    "  Judgment(0.01, \"social network\", 267752), # #chicagoGirl\n",
    "  Judgment(0.01, \"social network\", 38408),  # Life As We Know It\n",
    "  Judgment(0.01, \"social network\", 28303),  # The Cheyenne Social Club\n",
    "  Judgment(0.99, \"star wars\", 11),     # Star Wars\n",
    "  Judgment(0.80, \"star wars\", 1892),   # Return of Jedi\n",
    "  Judgment(0.20, \"star wars\", 54138),  # Star Trek Into Darkness\n",
    "  Judgment(0.01, \"star wars\", 85783),  # The Star\n",
    "  Judgment(0.20, \"star wars\", 325553)  # Battlestar Galactica\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.03\n",
    "\n",
    "Viewing session 2 of query `transformers dark of the moon` in retrotech. Here we inspect one of the sessions. We encourage you to examine other sessions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>sess_id</th>\n",
       "      <th>query</th>\n",
       "      <th>rank</th>\n",
       "      <th>doc_id</th>\n",
       "      <th>clicked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>50002</td>\n",
       "      <td>blue ray</td>\n",
       "      <td>0.0</td>\n",
       "      <td>600603141003</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>50002</td>\n",
       "      <td>blue ray</td>\n",
       "      <td>1.0</td>\n",
       "      <td>827396513927</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>50002</td>\n",
       "      <td>blue ray</td>\n",
       "      <td>2.0</td>\n",
       "      <td>24543672067</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>50002</td>\n",
       "      <td>blue ray</td>\n",
       "      <td>3.0</td>\n",
       "      <td>719192580374</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>50002</td>\n",
       "      <td>blue ray</td>\n",
       "      <td>4.0</td>\n",
       "      <td>885170033412</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74995</th>\n",
       "      <td>5001</td>\n",
       "      <td>transformers dark of the moon</td>\n",
       "      <td>10.0</td>\n",
       "      <td>47875841369</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74996</th>\n",
       "      <td>5001</td>\n",
       "      <td>transformers dark of the moon</td>\n",
       "      <td>11.0</td>\n",
       "      <td>97363560449</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74997</th>\n",
       "      <td>5001</td>\n",
       "      <td>transformers dark of the moon</td>\n",
       "      <td>12.0</td>\n",
       "      <td>93624956037</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74998</th>\n",
       "      <td>5001</td>\n",
       "      <td>transformers dark of the moon</td>\n",
       "      <td>13.0</td>\n",
       "      <td>97363532149</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>74999</th>\n",
       "      <td>5001</td>\n",
       "      <td>transformers dark of the moon</td>\n",
       "      <td>14.0</td>\n",
       "      <td>400192926087</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1775000 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       sess_id                          query  rank        doc_id  clicked\n",
       "0        50002                       blue ray   0.0  600603141003     True\n",
       "1        50002                       blue ray   1.0  827396513927    False\n",
       "2        50002                       blue ray   2.0   24543672067    False\n",
       "3        50002                       blue ray   3.0  719192580374    False\n",
       "4        50002                       blue ray   4.0  885170033412     True\n",
       "...        ...                            ...   ...           ...      ...\n",
       "74995     5001  transformers dark of the moon  10.0   47875841369    False\n",
       "74996     5001  transformers dark of the moon  11.0   97363560449    False\n",
       "74997     5001  transformers dark of the moon  12.0   93624956037    False\n",
       "74998     5001  transformers dark of the moon  13.0   97363532149    False\n",
       "74999     5001  transformers dark of the moon  14.0  400192926087    False\n",
       "\n",
       "[1775000 rows x 5 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "all_sessions()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "sessions = all_sessions()\n",
    "products = fetch_products(doc_ids=sessions[\"doc_id\"].unique())\n",
    "\n",
    "def print_series_data(series_data, column):\n",
    "    #pandas.set_option(\"display.width\", 76)\n",
    "    dataframe = series_data.to_frame(name=column).sort_values(column, ascending=False)\n",
    "    merged = dataframe.merge(products, left_on='doc_id', right_on='upc', how='left')\n",
    "    print(merged.rename(columns={\"upc\": \"doc_id\"})[[\"doc_id\", column, \"name\"]].set_index(\"doc_id\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 CTR                                               name\n",
      "doc_id                                                                 \n",
      "97360810042   0.0824      Transformers: Dark of the Moon - Blu-ray Disc\n",
      "47875842328   0.0734  Transformers: Dark of the Moon Stealth Force E...\n",
      "47875841420   0.0434  Transformers: Dark of the Moon Decepticons - N...\n",
      "24543701538   0.0364  The A-Team - Widescreen Dubbed Subtitle AC3 - ...\n",
      "25192107191   0.0352              Fast Five - Widescreen - Blu-ray Disc\n",
      "786936817218  0.0236  Pirates Of The Caribbean: On Stranger Tides (3...\n",
      "786936817218  0.0236  Pirates of the Caribbean: On Stranger Tides - ...\n",
      "97363560449   0.0192  Transformers: Dark of the Moon - Widescreen Du...\n",
      "47875841406   0.0160  Transformers: Dark of the Moon Autobots - Nint...\n",
      "400192926087  0.0124  Transformers: Dark of the Moon - Original Soun...\n",
      "47875842335   0.0106  Transformers: Dark of the Moon Stealth Force E...\n",
      "97363532149   0.0084  Transformers: Revenge of the Fallen - Widescre...\n",
      "36725235564   0.0082   Samsung - 40\" Class - LCD - 1080p - 120Hz - HDTV\n",
      "93624956037   0.0082  Transformers: Dark of the Moon - Original Soun...\n",
      "47875841369   0.0074     Transformers: Dark of the Moon - PlayStation 3\n",
      "24543750949   0.0062  X-Men: First Class - Widescreen Dubbed Subtitl...\n"
     ]
    }
   ],
   "source": [
    "query = \"transformers dark of the moon\"\n",
    "sessions = get_sessions(query, index=False)\n",
    "ctrs = calculate_ctr(sessions)\n",
    "print_series_data(ctrs, column=\"CTR\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                                 query  rank        doc_id  clicked\n",
      "sess_id                                                            \n",
      "3        transformers dark of the moon   0.0   47875842328    False\n",
      "3        transformers dark of the moon   1.0   24543701538    False\n",
      "3        transformers dark of the moon   2.0   25192107191    False\n",
      "3        transformers dark of the moon   3.0   47875841420    False\n",
      "3        transformers dark of the moon   4.0  786936817218    False\n",
      "3        transformers dark of the moon   5.0   47875842335    False\n",
      "3        transformers dark of the moon   6.0   97363532149    False\n",
      "3        transformers dark of the moon   7.0   97360810042     True\n",
      "3        transformers dark of the moon   8.0   24543750949    False\n",
      "3        transformers dark of the moon   9.0   36725235564    False\n",
      "3        transformers dark of the moon  10.0   47875841369    False\n",
      "3        transformers dark of the moon  11.0   97363560449    False\n",
      "3        transformers dark of the moon  12.0   93624956037    False\n",
      "3        transformers dark of the moon  13.0   47875841406    False\n",
      "3        transformers dark of the moon  14.0  400192926087    False\n"
     ]
    }
   ],
   "source": [
    "query = \"transformers dark of the moon\"\n",
    "sessions = get_sessions(query)\n",
    "print(sessions.loc[3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Listing 11.04\n",
    "\n",
    "Simple CTR based judgments for our query. We compute the CTR by taking the number of clicks for a document relative to the number of unique sessions the doc appears in for that query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "#%load -s calculate_ctr ../ltr/sdbn_functions.py\n",
    "def calculate_ctr(sessions):\n",
    "    click_counts = sessions.groupby(\"doc_id\")[\"clicked\"].sum()\n",
    "    sess_counts = sessions.groupby(\"doc_id\")[\"sess_id\"].nunique()\n",
    "    ctrs = click_counts / sess_counts\n",
    "    return ctrs.sort_values(ascending=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 CTR                                               name\n",
      "doc_id                                                                 \n",
      "97360810042   0.0824      Transformers: Dark of the Moon - Blu-ray Disc\n",
      "47875842328   0.0734  Transformers: Dark of the Moon Stealth Force E...\n",
      "47875841420   0.0434  Transformers: Dark of the Moon Decepticons - N...\n",
      "24543701538   0.0364  The A-Team - Widescreen Dubbed Subtitle AC3 - ...\n",
      "25192107191   0.0352              Fast Five - Widescreen - Blu-ray Disc\n",
      "786936817218  0.0236  Pirates Of The Caribbean: On Stranger Tides (3...\n",
      "786936817218  0.0236  Pirates of the Caribbean: On Stranger Tides - ...\n",
      "97363560449   0.0192  Transformers: Dark of the Moon - Widescreen Du...\n",
      "47875841406   0.0160  Transformers: Dark of the Moon Autobots - Nint...\n",
      "400192926087  0.0124  Transformers: Dark of the Moon - Original Soun...\n",
      "47875842335   0.0106  Transformers: Dark of the Moon Stealth Force E...\n",
      "97363532149   0.0084  Transformers: Revenge of the Fallen - Widescre...\n",
      "36725235564   0.0082   Samsung - 40\" Class - LCD - 1080p - 120Hz - HDTV\n",
      "93624956037   0.0082  Transformers: Dark of the Moon - Original Soun...\n",
      "47875841369   0.0074     Transformers: Dark of the Moon - PlayStation 3\n",
      "24543750949   0.0062  X-Men: First Class - Widescreen Dubbed Subtitl...\n"
     ]
    }
   ],
   "source": [
    "query = \"transformers dark of the moon\"\n",
    "sessions = get_sessions(query, index=False)\n",
    "ctrs = calculate_ctr(sessions)\n",
    "print_series_data(ctrs, \"CTR\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Figure 11.2\n",
    "\n",
    "Source code to render CTR judgment's ideal relevance ranking for `transformers dark of the moon`. In other words, our search results ordered from highest CTR to lowest.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 ctr\n",
      "doc_id              \n",
      "97360810042   0.0824\n",
      "47875842328   0.0734\n",
      "47875841420   0.0434\n",
      "24543701538   0.0364\n",
      "25192107191   0.0352\n",
      "786936817218  0.0236\n",
      "97363560449   0.0192\n",
      "47875841406   0.0160\n",
      "400192926087  0.0124\n",
      "47875842335   0.0106\n",
      "97363532149   0.0084\n",
      "36725235564   0.0082\n",
      "93624956037   0.0082\n",
      "47875841369   0.0074\n",
      "24543750949   0.0062\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "        <style>img {width:125px; height:125px; }\n",
       "               tr {font-size:24px; text-align:left; font-weight:normal; }\n",
       "               td {padding-right:40px; text-align:left; }\n",
       "               th {font-size:28px; text-align:left; }\n",
       "               th:nth-child(4) {width:125px; }</style><h1>Click-Thru-Rate Judgments for q=transformers dark of the moon</h><table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ctr</th>\n",
       "      <th>upc</th>\n",
       "      <th>image</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0824</td>\n",
       "      <td>97360810042</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/97360810042.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon - Blu-ray Disc</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.0734</td>\n",
       "      <td>47875842328</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/47875842328.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon Stealth Force Edition - Nintendo Wii</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.0434</td>\n",
       "      <td>47875841420</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/47875841420.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon Decepticons - Nintendo DS</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.0364</td>\n",
       "      <td>24543701538</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/24543701538.jpg\"></td>\n",
       "      <td>The A-Team - Widescreen Dubbed Subtitle AC3 - Blu-ray Disc</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.0352</td>\n",
       "      <td>25192107191</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/25192107191.jpg\"></td>\n",
       "      <td>Fast Five - Widescreen - Blu-ray Disc</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = \"transformers dark of the moon\"\n",
    "sessions = get_sessions(query, index=False)\n",
    "ctrs = calculate_ctr(sessions)\n",
    "df = ctrs.to_frame(name=\"ctr\").round(4)\n",
    "print(df)\n",
    "render_judged(products,\n",
    "              df.sort_values(\"ctr\", ascending=False),\n",
    "              grade_col=\"ctr\",\n",
    "              label=f\"Click-Thru-Rate Judgments for q={query}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Figure 11.3\n",
    "\n",
    "Source code to render CTR ideal relevance ranking for `dryer`. Ordering the highest CTR result to the lowest."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <style>img {width:125px; height:125px; }\n",
       "               tr {font-size:24px; text-align:left; font-weight:normal; }\n",
       "               td {padding-right:40px; text-align:left; }\n",
       "               th {font-size:28px; text-align:left; }\n",
       "               th:nth-child(4) {width:125px; }</style><h1>Click-Thru-Rate Judgments for q=dryer</h><table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ctr</th>\n",
       "      <th>upc</th>\n",
       "      <th>image</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.1608</td>\n",
       "      <td>84691226727</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/84691226727.jpg\"></td>\n",
       "      <td>GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.0816</td>\n",
       "      <td>84691226703</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/84691226703.jpg\"></td>\n",
       "      <td>Hotpoint - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White-on-White</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.0710</td>\n",
       "      <td>12505451713</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/12505451713.jpg\"></td>\n",
       "      <td>Frigidaire - Semi-Rigid Dryer Vent Kit - Silver</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.0576</td>\n",
       "      <td>783722274422</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/783722274422.jpg\"></td>\n",
       "      <td>The Independent - Widescreen Subtitle - DVD</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.0572</td>\n",
       "      <td>883049066905</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/883049066905.jpg\"></td>\n",
       "      <td>Whirlpool - Affresh Washer Cleaner</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "query = \"dryer\"\n",
    "sessions = get_sessions(query, index=False)\n",
    "ctrs = calculate_ctr(sessions)\n",
    "render_judged(products,\n",
    "              ctrs.to_frame(name=\"ctr\").sort_values(\"ctr\", ascending=False),\n",
    "              grade_col=\"ctr\",\n",
    "              label=f\"Click-Thru-Rate Judgments for q={query}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Listing 11.05\n",
    "\n",
    "Computing the global CTR of each rank per search ranking to consider whether the click data is biased by position. We look over every search to see what the CTR is when a document is placed in a specific rank."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "rank\n",
      "0.0     0.249727\n",
      "1.0     0.142673\n",
      "2.0     0.084218\n",
      "3.0     0.063073\n",
      "4.0     0.056255\n",
      "5.0     0.042255\n",
      "6.0     0.033236\n",
      "7.0     0.038000\n",
      "8.0     0.020964\n",
      "9.0     0.017364\n",
      "10.0    0.013982\n",
      "11.0    0.018582\n",
      "12.0    0.015982\n",
      "13.0    0.014509\n",
      "14.0    0.012327\n",
      "15.0    0.010200\n",
      "16.0    0.011782\n",
      "17.0    0.007891\n",
      "18.0    0.007273\n",
      "19.0    0.008145\n",
      "20.0    0.006236\n",
      "21.0    0.004473\n",
      "22.0    0.005455\n",
      "23.0    0.004982\n",
      "24.0    0.005309\n",
      "25.0    0.004364\n",
      "26.0    0.005055\n",
      "27.0    0.004691\n",
      "28.0    0.005000\n",
      "29.0    0.005400\n",
      "Name: clicked, dtype: float64\n"
     ]
    }
   ],
   "source": [
    "sessions = all_sessions()\n",
    "num_sessions = len(sessions[\"sess_id\"].unique())\n",
    "ctr_by_rank = sessions.groupby(\"rank\")[\"clicked\"].sum() / num_sessions\n",
    "print(ctr_by_rank)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Listing 11.06\n",
    "\n",
    "We look at the documents for our query, and notice that certain ones tend to appear higher and others tend to appear lower. If irrelevant ones dominate the top listings, position bias will dominate our training data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# %load -s calculate_average_rank ../ltr/sdbn_functions.py\n",
    "def calculate_average_rank(sessions):\n",
    "    avg_rank = sessions.groupby(\"doc_id\")[\"rank\"].mean()\n",
    "    return avg_rank.sort_values(ascending=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              mean_rank                                               name\n",
      "doc_id                                                                    \n",
      "400192926087    13.0526  Transformers: Dark of the Moon - Original Soun...\n",
      "97363532149     12.1494  Transformers: Revenge of the Fallen - Widescre...\n",
      "93624956037     11.3298  Transformers: Dark of the Moon - Original Soun...\n",
      "97363560449     10.4304  Transformers: Dark of the Moon - Widescreen Du...\n",
      "47875841369      9.5796     Transformers: Dark of the Moon - PlayStation 3\n",
      "36725235564      8.6854   Samsung - 40\" Class - LCD - 1080p - 120Hz - HDTV\n",
      "24543750949      7.8626  X-Men: First Class - Widescreen Dubbed Subtitl...\n",
      "97360810042      7.0130      Transformers: Dark of the Moon - Blu-ray Disc\n",
      "47875841406      6.1378  Transformers: Dark of the Moon Autobots - Nint...\n",
      "47875842335      5.2776  Transformers: Dark of the Moon Stealth Force E...\n",
      "786936817218     4.4444  Pirates Of The Caribbean: On Stranger Tides (3...\n",
      "786936817218     4.4444  Pirates of the Caribbean: On Stranger Tides - ...\n",
      "47875841420      3.5344  Transformers: Dark of the Moon Decepticons - N...\n",
      "25192107191      2.6596              Fast Five - Widescreen - Blu-ray Disc\n",
      "24543701538      1.8626  The A-Team - Widescreen Dubbed Subtitle AC3 - ...\n",
      "47875842328      0.9808  Transformers: Dark of the Moon Stealth Force E...\n"
     ]
    }
   ],
   "source": [
    "sessions = get_sessions(\"transformers dark of the moon\")\n",
    "average_rank = calculate_average_rank(sessions)\n",
    "print_series_data(average_rank, \"mean_rank\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Figure 11.4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <style>img {width:125px; height:125px; }\n",
       "               tr {font-size:24px; text-align:left; font-weight:normal; }\n",
       "               td {padding-right:40px; text-align:left; }\n",
       "               th {font-size:28px; text-align:left; }\n",
       "               th:nth-child(4) {width:125px; }</style><h1>Typical Search Session for q=dryer</h><table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>mean_rank</th>\n",
       "      <th>upc</th>\n",
       "      <th>image</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.9808</td>\n",
       "      <td>47875842328</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/47875842328.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon Stealth Force Edition - Nintendo Wii</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1.8626</td>\n",
       "      <td>24543701538</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/24543701538.jpg\"></td>\n",
       "      <td>The A-Team - Widescreen Dubbed Subtitle AC3 - Blu-ray Disc</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2.6596</td>\n",
       "      <td>25192107191</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/25192107191.jpg\"></td>\n",
       "      <td>Fast Five - Widescreen - Blu-ray Disc</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3.5344</td>\n",
       "      <td>47875841420</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/47875841420.jpg\"></td>\n",
       "      <td>Transformers: Dark of the Moon Decepticons - Nintendo DS</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4.4444</td>\n",
       "      <td>786936817218</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/786936817218.jpg\"></td>\n",
       "      <td>Pirates Of The Caribbean: On Stranger Tides (3-D) - Blu-ray 3D</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sessions = get_sessions(\"transformers dark of the moon\")\n",
    "average_rank = calculate_average_rank(sessions)\n",
    "render_judged(products, \n",
    "              average_rank.to_frame(name=\"mean_rank\").sort_values(\"mean_rank\", ascending=True),\n",
    "              grade_col=\"mean_rank\",\n",
    "              label=f\"Typical Search Session for q={query}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                 mean                                               name\n",
      "doc_id                                                                  \n",
      "856751002097  17.0208                   Practecol - Dryer Balls (2-Pack)\n",
      "48231011396   16.1548  LG - 3.5 Cu. Ft. 7-Cycle High-Efficiency Washe...\n",
      "12505527456   15.3526  Smart Choice - 1/2\" Safety+PLUS Stainless-Stee...\n",
      "36725578241   14.7286  Samsung - 7.3 Cu. Ft. 7-Cycle Electric Dryer -...\n",
      "36725561977   13.8932  Samsung - 3.5 Cu. Ft. 6-Cycle High-Efficiency ...\n",
      "883929085118  12.9996     A Charlie Brown Christmas - AC3 - Blu-ray Disc\n",
      "74108007469   12.2940  Conair - 1875-Watt Folding Handle Hair Dryer -...\n",
      "48231011402   11.4734    LG - 7.1 Cu. Ft. 7-Cycle Electric Dryer - White\n",
      "12505525766   10.6500        Smart Choice - 6' 30 Amp 3-Prong Dryer Cord\n",
      "36172950027    9.8758    Tools in the Dryer: A Rarities Compilation - CD\n",
      "74108096487    9.1230  Conair - Infiniti Cord-Keeper Professional Tou...\n",
      "14381196320    8.3308                           The Mind Snatchers - DVD\n",
      "665331101927   7.5138                          Everything in Static - CD\n",
      "783722274422   6.7610        The Independent - Widescreen Subtitle - DVD\n",
      "77283045400    5.9318                    Hello Kitty - Hair Dryer - Pink\n",
      "74108056764    5.1276  Conair - Infiniti Ionic Cord-Keeper Hair Dryer...\n",
      "84691226703    4.4552  Hotpoint - 6.0 Cu. Ft. 3-Cycle Electric Dryer ...\n",
      "883049066905   3.5726                 Whirlpool - Affresh Washer Cleaner\n",
      "84691226727    2.8290    GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White\n",
      "12505451713    1.9124    Frigidaire - Semi-Rigid Dryer Vent Kit - Silver\n"
     ]
    }
   ],
   "source": [
    "query = \"dryer\"\n",
    "sessions = get_sessions(query)\n",
    "average_rank = calculate_average_rank(sessions)\n",
    "print_series_data(average_rank, \"mean\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "        <style>img {width:125px; height:125px; }\n",
       "               tr {font-size:24px; text-align:left; font-weight:normal; }\n",
       "               td {padding-right:40px; text-align:left; }\n",
       "               th {font-size:28px; text-align:left; }\n",
       "               th:nth-child(4) {width:125px; }</style><h1>Typical Search Session for q=dryer</h><table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>rank</th>\n",
       "      <th>upc</th>\n",
       "      <th>image</th>\n",
       "      <th>name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1.9124</td>\n",
       "      <td>12505451713</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/12505451713.jpg\"></td>\n",
       "      <td>Frigidaire - Semi-Rigid Dryer Vent Kit - Silver</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2.8290</td>\n",
       "      <td>84691226727</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/84691226727.jpg\"></td>\n",
       "      <td>GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3.5726</td>\n",
       "      <td>883049066905</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/883049066905.jpg\"></td>\n",
       "      <td>Whirlpool - Affresh Washer Cleaner</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4.4552</td>\n",
       "      <td>84691226703</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/84691226703.jpg\"></td>\n",
       "      <td>Hotpoint - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White-on-White</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.1276</td>\n",
       "      <td>74108056764</td>\n",
       "      <td><img height=\"100\" src=\"../../data/retrotech/images/74108056764.jpg\"></td>\n",
       "      <td>Conair - Infiniti Ionic Cord-Keeper Hair Dryer - Light Purple</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "render_judged(products, \n",
    "              average_rank.reset_index().sort_values(\"rank\"),\n",
    "              grade_col=\"rank\",\n",
    "              label=f\"Typical Search Session for q={query}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Up next: [Using SDBN Click Model To Overcome Position Bias](2.sdbn-judgments-to-overcome-position-bias.ipynb)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
